master merge for 0.5.3 release #1682

Merged
merged 10 commits into from
Aug 13, 2024


rudolfix
Collaborator

Description

master merge for 0.5.3 release

sh-rp and others added 10 commits August 5, 2024 00:08
…he loader (#1494)

* add support for starting load jobs as slots free up

* update loader class to devel changes

* update failed w_d test

* reduce sleep time for now

* add first implementation of futures on custom destination

* rename start_file_load to get_load_job

* add first version of working follow up jobs for new loader setup

* require jobclient in constructor for duckdb

* fixes some dummy tests

* update all jobs to have the new run method

* unify file_path argument in loadjobs

* fixes some filepath related tests

* renames job classes for more clarity and small updates

* re-organize jobs a bit more
fix some tests

* fix destination parallelism

* remove changed in config.toml

* replace emptyloadjob with finalized load job

* make sure files are only moved on main thread

* tmp

* wrap job instantiation in try catch block (still needs improvement)

* post devel merge fix

* simplify followupjob creation
assumes followup jobs can always be created without error

* refactor job restoring

* simplify common fields on loadjobs
mark load job vars private

* completely separate followupjobs from regular loadjobs

* unify some more loadjob vars

* fix job client tests

* amend last commit

* fix handling of jobs in loader

* fix a couple more tests

* fix deltalake load jobs

* fix pending exceptions code

* fix partial load tests

* fix custom destination and delta table tests

* remove one unclear assertion for now

* fix clickhouse loadjob

* fix databricks loadjob

* fix one weaviate and the qdrant local tests (hopefully :)

* fix one pipeline test

* add a couple of loader test stubs

* update bigquery load jobs to new format

* fix bigquery resume test

* add additional check to bigquery job resume test

* write to delta tables in single commit
revert all delta table tests to original
ensure delta tables are still executed on a thread

* fix broken filesystem loading

* add some simple jobs tests

* fix recursion problem

* remove a bit of unneeded code

* do not open remote connection when creating a load job

* fix weaviate

* post devel merge fixes

* only update load package info if jobs were finalized

* fix two obviously wrong tests...

* create client on thread for jobs

* fix sql_client / job_client vars
improve performance on starting new jobs (tests pending)

* add tests for available slots and update tests for getting and filtering new jobs

* clean up complete package condition

* Merge branch 'devel' into feat/continuous-load-jobs

# Conflicts:
#	dlt/destinations/impl/clickhouse/clickhouse.py
#	tests/load/bigquery/test_bigquery_client.py

* improve table-sequential job filtering

* fix resume job test

* fix load job init exceptions tests

* remove test stubs for tests that already exist

* add some benchmark code to loader tests (in progress)

* amend loader benchmark test

* remove job_client from RunnableLoadJob initializer params

* fix bg streaming insert

* fix bigquery streaming insert

* small renaming and logging changes

* remove delta job type in favor of using the reference jobs

* nicer logging when jobs pool is being drained

* small comment change

* test exception in followup job creation

* add tests for followup jobs

* improve dummy tests for better followup job testing

* fix linter

* put sleep amount back to 1.0 while checking for completed load jobs

* create explicit exceptions for failed table chain jobs

* make the large load package test faster

* fix trace test

* allow clients to prepare for job execution on thread and move query tag execution there.

* fix runnable job tests and linter

* fix linter again and remove wrong value from tests

* test

* update detection of pending jobs, will probably break some tests

* fix two tests of pending packages

* fix test_remove_pending_packages test

* switch to docker compose subcommand

* fix compose deployments

* fix test for arrow version in delta tables
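
The first commit above describes starting load jobs as slots free up, rather than waiting for a whole batch to finish. The following is a minimal sketch of that general idea using only the Python standard library. It is not the dlt loader implementation: `run_jobs_with_slots`, `jobs`, and `max_slots` are hypothetical names chosen for illustration.

```python
from concurrent.futures import FIRST_COMPLETED, Future, ThreadPoolExecutor, wait
from typing import Callable, Dict, List


def run_jobs_with_slots(jobs: Dict[str, Callable[[], None]], max_slots: int) -> List[str]:
    """Run every job with at most `max_slots` running at once.

    Hypothetical helper: returns job names in the order they completed.
    """
    pending = list(jobs.items())
    running: Dict[Future, str] = {}
    completed: List[str] = []

    with ThreadPoolExecutor(max_workers=max_slots) as pool:
        while pending or running:
            # Fill every free slot before waiting again.
            while pending and len(running) < max_slots:
                name, fn = pending.pop(0)
                running[pool.submit(fn)] = name

            # Block until at least one job finishes, which frees a slot.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any exception from the job
                completed.append(name)

    return completed
```

A new job is submitted as soon as any running job completes, so one slow job does not hold back the rest of the queue.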
* Added incremental configuration to SQL resources

* Minor corrections and fixes.
* add get_delta_tables helper function

* add tests for pipeline methods

* get delta tables from schema instead of job file extension

* add arg to make remote table dirs

* make get_delta_tables work for remote storage

* typo

* make get_delta_tables work for child tables

* replace test util with get_delta_tables

* move pyarrow version check for delta table format

* move pyarrow version check for delta table format again

* import from internal dlt lib

* document pyarrow version requirement for delta table format

* document get_delta_tables helper function
* Raise/warn on incomplete columns in normalize

Raise on not-nullable columns to catch e.g. a misspelled merge/primary key

* Update error msg

* Test for null values

* Lint

* Delete now invalid tests

* Fix common test
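
The commit above raises on not-nullable incomplete columns and warns on the rest. A rough sketch of such a check follows, assuming that "incomplete" means a column declared in the schema without a `data_type` because no data arrived for it. The function name, the dictionary layout, and the messages are hypothetical and do not reproduce the dlt code.

```python
import warnings
from typing import Any, Dict, List


def check_incomplete_columns(table_name: str, columns: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the names of incomplete columns, raising if any is not nullable.

    Assumption: a column is incomplete when it has no "data_type" entry.
    """
    incomplete = [name for name, col in columns.items() if "data_type" not in col]
    for name in incomplete:
        if columns[name].get("nullable", True) is False:
            # A not-nullable column that never received data is likely a
            # misspelled key hint, so fail loudly.
            raise ValueError(
                f"Column '{name}' in table '{table_name}' is not nullable "
                "but no data was seen for it"
            )
        warnings.warn(
            f"Column '{name}' in table '{table_name}' has no data type; "
            "no data was seen for it"
        )
    return incomplete
```

Raising only for not-nullable columns keeps optional hints lenient while still catching a key that was declared under the wrong name.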
* Add dataset_name_normalization option

* Change name to enable_dataset_name_normalization
…pdates docs (#1674)

* adds full ci for motherduck and updates docs

* drops parquet locks from duckdb, matches parquet to columns by name, allows full jsonl loading

* fixes basic job and sql client tests so motherduck+parquet runs

* adds parallel parquet loading test
@rudolfix rudolfix self-assigned this Aug 12, 2024

netlify bot commented Aug 12, 2024

Deploy Preview for dlt-hub-docs ready!

Name Link
🔨 Latest commit 5795b14
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/66b9c00abfc1ca000857e63c
😎 Deploy Preview https://deploy-preview-1682--dlt-hub-docs.netlify.app

@rudolfix rudolfix merged commit 19c41ea into master Aug 13, 2024
56 checks passed
7 participants